Abstract

For a graph $G$, let $p_i(G)$, $i = 0, \ldots, 3$, be the probability that three distinct random vertices span
exactly $i$ edges. We call $(p_0(G), \ldots, p_3(G))$ the 3-local profile of $G$. We investigate the set $S_3 \subset \mathbb{R}^4$
of all vectors $(p_0, \ldots, p_3)$ that are arbitrarily close to the 3-local profiles of arbitrarily large graphs.
We give a full description of the projection of $S_3$ to the $(p_0, p_3)$ plane. The upper envelope of this
planar domain is obtained from cliques on a fraction of the vertex set and complements of such
graphs. The lower envelope is Goodman's inequality $p_0 + p_3 \ge \frac{1}{4}$. We also give a full description of
the triangle-free case, i.e., the intersection of $S_3$ with the hyperplane $p_3 = 0$. This planar domain
is characterized by an SDP constraint that is derived from Razborov's flag algebra theory.
1 Introduction

For graphs $H, G$, we denote by $d(H; G)$ the induced density of the graph $H$ in the graph $G$, namely,
the probability that a random set of $|H|$ vertices in $G$ induces a copy of the graph $H$.
Many important problems and theorems in graph theory can be formulated in the framework
of graph densities. Most of the emphasis so far has been on edge counts, or what is the same, on
maximizing $d(K_2; G)$ subject to some restrictions. Thus Turán's theorem determines $\max d(K_2; G)$
under the assumption $d(K_s; G) = 0$ for some $s \ge 3$. The theorem further says that the optimal graph
is the complete balanced $(s-1)$-partite graph. This was substantially extended by Erdős and Stone
[6], who determined $\max d(K_2; G)$ under the assumption that the $H$-density (not induced) of $G$ is zero
for some fixed graph $H$. Their theorem also shows that the answer depends only on the chromatic
number of $H$. Ramsey's theorem shows that for any two integers $r, s \ge 2$, every sufficiently large
graph $G$ has either $d(K_s; G) > 0$ or $d(\overline{K_r}; G) > 0$. The Kruskal-Katona Theorem [15, 16] can be
stated as saying that $d(K_r; G) = \alpha$ implies that $d(K_s; G) \le \alpha^{s/r}$ for $r \le s$. Finding $\min d(K_s; G)$
under the assumption $d(K_r; G) = \alpha$ turns out to be more difficult. The case $r = 2$ of this problem was
solved only recently in a series of papers by Razborov [21], Nikiforov [17] and Reiher [23]. A closely
related question is to minimize $d(K_s; G)$ given that $d(\overline{K_r}; G) = \alpha$ for some real $\alpha \in [0, 1]$ and integers
$r, s \ge 2$. The case $\alpha = 0$ of this problem was posed by Erdős more than 50 years ago. Although
Das et al. [3], and independently Pikhurko [18], recently solved it for certain values of $r$ and $s$, it is
still widely open in general.

Numerous further questions concerning the numbers $d(H; G)$ suggest themselves. Thus Goodman
[10] showed that $\min_G \left( d(K_3; G) + d(\overline{K_3}; G) \right) = 1/4 - o(1)$. As a random $G(n, \frac{1}{2})$ graph shows,
this bound is tight. Erdős [5] conjectured that a $G(n, \frac{1}{2})$ graph also minimizes $d(K_r; G) + d(\overline{K_r}; G)$
for all $r$, but this was refuted by Thomason [25] for all $r \ge 4$. A simple consequence of Goodman's
inequality is that $\min_G \max\{d(K_3; G), d(\overline{K_3}; G)\} = 1/8$. The analogous statement for $r = 4$ is not
true, as can be shown using an example of Franek and Rödl [8] (see [2] for the details). On the
other hand, the max-min version of this problem is now solved. As we have recently proved [14],
$\max_G \min\{d(K_r; G), d(\overline{K_r}; G)\}$ is obtained by a clique on a properly chosen fraction of the vertices.

Closely related to these questions is the notion of inducibility of graphs, first introduced in [19].
The inducibility of a graph $H$ is defined as $\lim_{n \to \infty} \max_G d(H; G)$, where the maximum is over all
$n$-vertex graphs $G$. This natural parameter has been investigated for several types of graphs $H$, e.g.,
complete bipartite and multipartite graphs [JJ [2] , very small graphs [2 [13] and blow-up graphs [UJ . 

In light of this discussion, the following general concept suggests itself. 

Definition 1.1. For a family of finite graphs $\mathcal{H} = (H_1, \ldots, H_t)$, let $d(\mathcal{H}; G) := (d(H_1; G), \ldots, d(H_t; G))$.
Define $A(\mathcal{H})$ to be the set of all $p = (p_1, \ldots, p_t) \in [0, 1]^t$ for which there exists a sequence of graphs $G_n$
such that $|G_n| \to \infty$ and $d(\mathcal{H}; G_n) \to p$. We likewise define $A_{\mathcal{G}}(\mathcal{H})$, where we require that $G_n \in \mathcal{G}$, an
infinite family of graphs of interest (e.g. $K_s$-free graphs).

The initial discussion suggests that it may be a very difficult task to fully describe $A(\mathcal{H})$ or $A_{\mathcal{G}}(\mathcal{H})$.
Indeed, it was shown by Hatami and Norine [12] that in general it is undecidable to determine the
linear inequalities that such sets satisfy. In this paper we solve two instances of this question.

We denote by $p_i(G)$ the probability that three distinct random vertices in the graph $G$ span
exactly $i$ edges. The first theorem describes the possible distributions of 3-cliques and 3-anticliques in
graphs (i.e., of $(p_0, p_3)$). We have Goodman's inequality [10] as a lower bound, and an upper bound
from [14]. We show that these bounds fully describe all possible $(p_0, p_3)$.
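To make the definition of $p_i(G)$ concrete, here is a minimal brute-force sketch in Python. The function name `local_profile_3` and the vertex labelling $0, \ldots, n-1$ are illustrative assumptions, not notation from the paper; the code simply enumerates all vertex triples and counts spanned edges.

```python
from fractions import Fraction
from itertools import combinations


def local_profile_3(n, edges):
    """Return (p_0, p_1, p_2, p_3) for the graph on vertices 0..n-1.

    p_i is the fraction of vertex triples spanning exactly i edges.
    The name and the input format are illustrative assumptions.
    """
    edge_set = {frozenset(e) for e in edges}
    counts = [0, 0, 0, 0]
    for triple in combinations(range(n), 3):
        spanned = sum(frozenset(pair) in edge_set
                      for pair in combinations(triple, 2))
        counts[spanned] += 1
    total = sum(counts)
    return tuple(Fraction(c, total) for c in counts)


# The 5-cycle: every triple spans one or two edges.
c5_edges = [(i, (i + 1) % 5) for i in range(5)]
c5_profile = local_profile_3(5, c5_edges)
```

For the 5-cycle this gives $p_0 + p_3 = 0$, which does not contradict Goodman's inequality: the bound $\frac{1}{4}$ holds only up to an error term that vanishes as the number of vertices grows.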

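Theorem 1.2. For $p_0 \in [0, 1]$, let $\beta$ be the unique root in $[0, 1]$ of $\beta^3 + 3\beta^2(1 - \beta) = p_0$. Then
$(p_0, p_3) \in A(\overline{K_3}, K_3)$ iff

$$p_0 + p_3 \ge \frac{1}{4} \quad \text{and} \quad p_3 \le \max\left\{ (1 - p_0^{1/3})^3 + 3p_0^{1/3}(1 - p_0^{1/3})^2,\ (1 - \beta)^3 \right\}.$$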
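The two conditions of Theorem 1.2 are easy to evaluate numerically. The sketch below is only an illustration under stated assumptions: the helper names are hypothetical, $\beta$ is found by bisection (the polynomial $\beta^3 + 3\beta^2(1-\beta)$ is increasing on $[0,1]$), and floating-point comparisons make the membership test approximate near the boundary.

```python
def clique_root(p0, tol=1e-12):
    """Bisection for the root beta in [0, 1] of beta^3 + 3 beta^2 (1 - beta) = p0."""
    def f(b):
        # Increasing on [0, 1]: its derivative is 6 b (1 - b) >= 0.
        return b ** 3 + 3 * b ** 2 * (1 - b)

    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if f(mid) < p0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def p3_upper_bound(p0):
    """Upper envelope: the larger of the two curves in Theorem 1.2."""
    beta = clique_root(p0)
    c = p0 ** (1 / 3)
    complement_of_clique = (1 - c) ** 3 + 3 * c * (1 - c) ** 2
    clique = (1 - beta) ** 3
    return max(complement_of_clique, clique)


def in_projection(p0, p3):
    """Approximate test of the two inequalities of Theorem 1.2 (hypothetical helper)."""
    return p0 + p3 >= 0.25 and p3 <= p3_upper_bound(p0)
```

For instance, the point $(\frac{1}{8}, \frac{1}{8})$, attained by $G(n, \frac{1}{2})$, lies on the lower envelope, whereas $(\frac{1}{2}, \frac{1}{2})$ violates the upper bound.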

The analogous question concerning $A(\overline{K_r}, K_r)$ for $r > 3$ is widely open. While the analogous upper
bound is proved in [14], the situation with respect to the lower bounds is still poorly understood [23].

The second theorem in this paper is proved using the theory of flag algebras [20]. This theory
provides a method to derive upper bounds in asymptotic extremal graph theory. This is accomplished
by generating certain semidefinite programs (=SDP) that pertain to the problem at hand. By passing
to the dual SDP we derive necessary conditions for membership in $A(\mathcal{H})$ or $A_{\mathcal{G}}(\mathcal{H})$. Section 4 contains
a self-contained discussion covering this perspective of the theory of flag algebras.

The theorem below demonstrates the special role that bipartite graphs play in the study of triangle-free
graphs. As the theorem shows, all 3-local profiles of triangle-free graphs are realizable as well by
bipartite graphs. Moreover, the theory of flag algebras provides a complete answer to this question.
This yields a different perspective on the fact that almost all triangle-free graphs are bipartite [4].
We denote the 3-vertex path by $P_3$ and its complement by $\overline{P_3}$. Also, as usual, $A \succeq 0$ means that
the matrix $A$ is positive semi-definite (=PSD). The class of bipartite (resp. triangle-free) graphs is
denoted by $\mathcal{BP}$ (resp. $\mathcal{TF}$).

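Theorem 1.3. For $p_0, p_1, p_2 \ge 0$ s.t. $p_0 + p_1 + p_2 = 1$, the following conditions are equivalent:

I. $(p_0, p_1, p_2) \in A_{\mathcal{TF}}(\overline{K_3}, \overline{P_3}, P_3)$.

II. $p_0 \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix} + \frac{p_1}{3} \begin{pmatrix} 1 & 1 \\ 1 & 0 \end{pmatrix} + \frac{p_2}{3} \begin{pmatrix} 0 & 1 \\ 1 & 1 \end{pmatrix} \succeq 0$.

III. $(p_0, p_1, p_2) \in A_{\mathcal{BP}}(\overline{K_3}, \overline{P_3}, P_3)$.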



The remainder of this paper is organized as follows. In Section 2 we use random graphs to show
the realizability of $A_{\mathcal{G}}(\mathcal{H})$. In Section 3 we prove Theorem 1.2. In Section 4 we use the theory of flag
algebras to derive SDP constraints on membership in $A_{\mathcal{G}}(\mathcal{H})$. In Section 5 we prove Theorem 1.3.
We close with some concluding remarks and several open problems.



2 Random Constructions 

Let $\mathcal{H} = (H_1, \ldots, H_t)$ be a collection of graphs, and $p = (p_1, \ldots, p_t) \in [0, 1]^t$. In order to prove that
$p \in A_{\mathcal{G}}(\mathcal{H})$, we need arbitrarily large graphs $G$ for which $\|d(\mathcal{H}; G) - p\|$ is negligible. We accomplish
this using appropriately designed random $G$'s. Let $\Pi$ be a symmetric $n \times n$ matrix with entries in $[0, 1]$
and zeros along the diagonal. Corresponding to $\Pi$ is a distribution $G(\Pi)$ on $n$-vertex graphs where
$ij$ is an edge with probability $\Pi_{i,j}$ and the choices are made independently for all $n \ge i > j \ge 1$. We
say that a graph $G$ is supported on $\Pi$ if $G$ is chosen from $G(\Pi)$ with positive probability.
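A minimal sketch of sampling from the distribution $G(\Pi)$ follows. The function name `sample_graph`, the list-of-lists encoding of $\Pi$ and the 0-based vertex labels are assumptions made for illustration only.

```python
import random


def sample_graph(pi, rng):
    """Draw one graph from G(Pi): pair {i, j} is an edge with probability pi[i][j].

    `pi` is a symmetric n x n list of lists with zero diagonal; `rng` is a
    random.Random instance. Choices for distinct pairs are independent.
    """
    n = len(pi)
    edges = []
    for i in range(n):
        for j in range(i):          # each unordered pair is visited once
            if rng.random() < pi[i][j]:
                edges.append((j, i))
    return edges


# Two extreme matrices: every sample is the complete (resp. empty) graph.
all_ones = [[0 if i == j else 1 for j in range(4)] for i in range(4)]
all_zeros = [[0] * 4 for _ in range(4)]
complete_edges = sample_graph(all_ones, random.Random(0))
empty_edges = sample_graph(all_zeros, random.Random(0))
```

A graph is supported on $\Pi$ exactly when it contains every pair with $\Pi_{i,j} = 1$ and no pair with $\Pi_{i,j} = 0$, which is what the two extreme matrices above illustrate.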

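Lemma 2.1. For every list of graphs $H_1, \ldots, H_t$ there exists an integer $N_0$ such that if $n > N_0$ and
$\Pi$ is an $n \times n$ matrix as above, then there exists an $n$-vertex graph $G^*$ supported on $\Pi$ such that

$$\forall i = 1, \ldots, t: \qquad \left| d(H_i; G^*) - \mathbb{E}_{G \sim G(\Pi)}\left[ d(H_i; G) \right] \right| < \frac{1}{\sqrt{n}}.$$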



We note that the statement need not hold if $G(\Pi)$ is replaced by an arbitrary distribution on
$n$-vertex graphs.

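Proof. Fix a graph $H$. Let us view an $n$-vertex graph $G$ as a $\binom{n}{2}$-dimensional binary vector. The
mapping $G \mapsto d(H; G)$ has Lipschitz constant $\binom{|H|}{2} / \binom{n}{2}$. We can therefore apply Azuma's inequality
and conclude that

$$\Pr_{G^* \sim G(\Pi)}\left[ \left| d(H; G^*) - \mathbb{E}_{G \sim G(\Pi)}\left[ d(H; G) \right] \right| \ge \frac{1}{\sqrt{n}} \right] \le 2 \exp\left( -\frac{\binom{n}{2}}{2n \binom{|H|}{2}^2} \right).$$

Using the union bound and denoting $h = \max_i |H_i|$, we get

$$\Pr_{G^* \sim G(\Pi)}\left[ \forall i: \ \left| d(H_i; G^*) - \mathbb{E}_{G \sim G(\Pi)}\left[ d(H_i; G) \right] \right| < \frac{1}{\sqrt{n}} \right] \ge 1 - 2t \exp\left( -\frac{\binom{n}{2}}{2n \binom{h}{2}^2} \right) = 1 - o(1).$$

□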



This lemma is easily generalized for hypergraphs of greater uniformity. 



3 Distribution of 3-cliques and 3-anticliques 



In this section we prove Theorem 1.2 and produce a full description of the set $A(\overline{K_3}, K_3)$. We state
the known lower and upper bounds and show that they fully describe this set.
Theorem 3.1 (Goodman [10]). For every $n$-vertex graph $G$,

$$p_0(G) + p_3(G) \ge \frac{1}{4} - O\left( \frac{1}{n} \right).$$

Theorem 3.2 ([14]). Let $r, s \ge 2$ be integers and suppose that $d(\overline{K_r}; G) \ge \alpha$ where $G$ is an $n$-vertex
graph and $1 \ge \alpha \ge 0$. Let $\beta$ be the unique root of $\beta^r + r\beta^{r-1}(1 - \beta) = \alpha$ in $[0, 1]$. Then

$$d(K_s; G) \le \max\left\{ (1 - \alpha^{1/r})^s + s\alpha^{1/r}(1 - \alpha^{1/r})^{s-1},\ (1 - \beta)^s \right\} + o(1).$$

Namely, given $d(\overline{K_r}; G)$, the maximum of $d(K_s; G)$ is attained up to a negligible error term either by
a clique on some subset of the $n$ vertices, or by the complement of such a graph. In particular, for
every $G$

$$p_3(G) \le \max\left\{ (1 - p_0(G)^{1/3})^3 + 3p_0(G)^{1/3}(1 - p_0(G)^{1/3})^2,\ (1 - \beta)^3 \right\} + o(1),$$

where $\beta$ is the unique root of $\beta^3 + 3\beta^2(1 - \beta) = p_0(G)$ in $[0, 1]$.



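Proof of Theorem 1.2. Let $C_1, C_2$ be the $(p_0, p_3)$ curves induced by cliques and complements of cliques,
respectively:

$$C_1 = \left\{ \left( (1 - x)^3 + 3(1 - x)^2 x,\ x^3 \right) \mid x \in [0, 1] \right\}$$
$$C_2 = \left\{ \left( x^3,\ (1 - x)^3 + 3(1 - x)^2 x \right) \mid x \in [0, 1] \right\}$$

For $i = 1, 2$ let $B_i \subseteq [0, 1]^2$ be the region bounded by $p_0 \ge 0$, $p_3 \ge 0$, $p_0 + p_3 \ge \frac{1}{4}$, and by $C_i$. We
need to prove that

$$A(\overline{K_3}, K_3) = B_1 \cup B_2.$$

By Theorems 3.1 and 3.2, $A(\overline{K_3}, K_3) \subseteq B_1 \cup B_2$.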

We show that every point in this domain can be approximated arbitrarily well by $(p_0(G), p_3(G))$
for arbitrarily large $G$. We define the following parameterized family of random graphs:

Figure 1: The curves that bound the region of $A(\overline{K_3}, K_3)$, and the auxiliary curve $C'$ used in the
proof



Definition 3.3. For every $x, a, b, c \in [0, 1]$, $G_{x,a,b,c}$ is the class of random graphs $(V, E)$, where
$V = A \sqcup B$ with $|A| = x|V|$ and $|B| = (1 - x)|V|$. Adjacencies are chosen independently among pairs,
with

$$\Pr(ij \in E) = \begin{cases} a & i, j \in A \\ b & i, j \in B \\ c & i \in A,\ j \in B \text{ or vice versa} \end{cases}$$



A simple computation shows that

$$\mathbb{E}\, p_0(G_{x,a,b,c}) = x^3(1 - a)^3 + (1 - x)^3(1 - b)^3 + 3x^2(1 - x)(1 - a)(1 - c)^2 + 3x(1 - x)^2(1 - b)(1 - c)^2 + o(1)$$

and

$$\mathbb{E}\, p_3(G_{x,a,b,c}) = x^3 a^3 + (1 - x)^3 b^3 + 3x^2(1 - x)ac^2 + 3x(1 - x)^2 bc^2 + o(1).$$

By Lemma 2.1, $(\mathbb{E}\, p_0(G_{x,a,b,c}), \mathbb{E}\, p_3(G_{x,a,b,c})) \in A(\overline{K_3}, K_3)$ for every $(x, a, b, c) \in [0, 1]^4$. The following
curve is used in the proof:

$$C' = \left\{ \left( t^3, (1 - t)^3 \right) \mid t \in [0, 1] \right\}$$

Consider the following continuous map,

$$H : [0, 1] \times [0, 1] \to A(\overline{K_3}, K_3)$$

$$H(x, a) = \left( \mathbb{E}[p_0(G_{x,a,1-a,1-a})],\ \mathbb{E}[p_3(G_{x,a,1-a,1-a})] \right).$$

The following claims are immediate (all values are limits as $|V| \to \infty$).

1. $H(x, 0) = \left( x^3,\ (1 - x)^3 + 3(1 - x)^2 x \right)$.

2. $H(x, 1) = \left( (1 - x)^3 + 3(1 - x)^2 x,\ x^3 \right)$.

3. $H(1, a) = \left( (1 - a)^3,\ a^3 \right)$.

4. $H(x, \frac{1}{2}) = \left( \frac{1}{8}, \frac{1}{8} \right)$.

5. $H(0, a) = \left( a^3,\ (1 - a)^3 \right)$.
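These boundary values can be checked numerically. The sketch below assumes the limiting formulas for $\mathbb{E}\, p_0$ and $\mathbb{E}\, p_3$ of the two-block model (with the $o(1)$ terms dropped) and assumes that $H(x, a)$ uses the parameters $(x, a, 1-a, 1-a)$; the function names are illustrative.

```python
def expected_p0(x, a, b, c):
    """Limiting probability that a random triple spans no edge in G_{x,a,b,c}."""
    return (x ** 3 * (1 - a) ** 3 + (1 - x) ** 3 * (1 - b) ** 3
            + 3 * x ** 2 * (1 - x) * (1 - a) * (1 - c) ** 2
            + 3 * x * (1 - x) ** 2 * (1 - b) * (1 - c) ** 2)


def expected_p3(x, a, b, c):
    """Limiting probability that a random triple spans a triangle in G_{x,a,b,c}."""
    return (x ** 3 * a ** 3 + (1 - x) ** 3 * b ** 3
            + 3 * x ** 2 * (1 - x) * a * c ** 2
            + 3 * x * (1 - x) ** 2 * b * c ** 2)


def H(x, a):
    """The map H, assuming parameters (x, a, 1 - a, 1 - a)."""
    return (expected_p0(x, a, 1 - a, 1 - a), expected_p3(x, a, 1 - a, 1 - a))


def close(u, v, tol=1e-12):
    """Compare two points of the plane up to floating-point error."""
    return abs(u[0] - v[0]) < tol and abs(u[1] - v[1]) < tol


grid = [k / 10 for k in range(11)]
boundary_ok = all(
    close(H(t, 0), (t ** 3, (1 - t) ** 3 + 3 * (1 - t) ** 2 * t))
    and close(H(t, 1), ((1 - t) ** 3 + 3 * (1 - t) ** 2 * t, t ** 3))
    and close(H(1, t), ((1 - t) ** 3, t ** 3))
    and close(H(t, 0.5), (0.125, 0.125))
    and close(H(0, t), (t ** 3, (1 - t) ** 3))
    for t in grid
)
```

The value at $a = \frac{1}{2}$ is independent of $x$ because all three edge probabilities equal $\frac{1}{2}$ there, so the model is just $G(n, \frac{1}{2})$.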

$H|_{[0,1] \times [0,\frac{1}{2}]}$ is a continuous map from a topological 2-disc. The boundary of this disc is mapped
to a path encircling $C_2 \cup C'$. Therefore, $C_2 \cup C'$ is contractible in $\mathrm{im}(H)$, and consequently the
region bounded by $C_2$ and $C'$ is contained in $\mathrm{im}(H)$, and also in $A(\overline{K_3}, K_3)$. A similar argument for
$H|_{[0,1] \times [\frac{1}{2},1]}$ shows that the region bounded by $C_1$ and $C'$ is contained in $A(\overline{K_3}, K_3)$.
The remaining area in $A(\overline{K_3}, K_3)$ will be covered similarly. Consider the following continuous map,

$$H_1 : [0, 1] \times [0, 1] \to A(\overline{K_3}, K_3)$$

$$H_1(x, a) = \left( \mathbb{E}[p_0(G_{x,a,a,1-a})],\ \mathbb{E}[p_3(G_{x,a,a,1-a})] \right).$$

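Again, the following claims are immediate.

1. $H_1(x, 0) = \left( x^3 + (1 - x)^3,\ 0 \right)$.

2. $H_1(\frac{1}{2}, a) = \frac{1}{8}\left( 1 - (2a - 1)^3,\ 1 + (2a - 1)^3 \right)$.

3. $H_1(x, 1) = \left( 0,\ x^3 + (1 - x)^3 \right)$.

4. $H_1(0, a) = \left( (1 - a)^3,\ a^3 \right)$.

$H_1|_{[0,\frac{1}{2}] \times [0,1]}$ is a continuous map from a topological 2-disc, mapping its boundary to a path encircling
$C'$, $[\frac{1}{4}, 1] \times \{0\}$, $\left\{ \left( \frac{t}{4}, \frac{1-t}{4} \right) \mid t \in [0, 1] \right\}$ and $\{0\} \times [\frac{1}{4}, 1]$. Therefore, as before, the region between these
curves is contained in $A(\overline{K_3}, K_3)$. Altogether, $B_1 \cup B_2 \subseteq A(\overline{K_3}, K_3)$ is obtained.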

□ 

4 Flag algebras - a dual perspective 

Let $\mathcal{G}$ be an infinite family of graphs closed under taking induced subgraphs, and let $\mathcal{H} = (H_1, \ldots, H_t)$
be a collection of graphs. We formulate necessary conditions for membership in the set $A_{\mathcal{G}}(\mathcal{H})$, which
are stated in terms of feasibility of some SDP. This part is self-contained, and concentrates on the
connections between the theory of flag algebras and standard arguments in discrete optimization.

Definition 4.1. An $(s, k)$-flagged graph $F = (H, U)$ consists of an $s$-vertex graph $H$ and a flag
$U = (u_1, \ldots, u_k)$, an ordered set of $k$ vertices in $H$. An isomorphism $F \cong F'$ between flagged graphs
$F = (H, U)$ and $F' = (H', U')$ is a graph isomorphism

$$\varphi : V(H) \to V(H') \text{ such that } \varphi(u_i) = u'_i \ \ \forall i.$$

Definition 4.2. Let $G$ be a graph and $F_1, F_2$ be $(s, k)$-flagged graphs. Choose uniformly at random
two subsets $(V_1, V_2)$ of $V(G)$ of size $s$ with intersection $U = V_1 \cap V_2$ of cardinality $k$, and choose a random
ordering of $U$. Define

$$p(F_1, F_2; G) = \Pr\left[ F_i \cong (G|_{V_i}, U),\ i = 1, 2 \right].$$

Associated with every list $F_1, \ldots, F_l$ of $(s, k)$-flagged graphs is the $l \times l$ matrix $A^G = A^G(F_1, \ldots, F_l)$ given by

$$\forall i, j \qquad A^G(F_1, \ldots, F_l)_{i,j} = p(F_i, F_j; G).$$

Note that $A^G$ is a symmetric matrix.

Example 4.3. Denote by $e$ (resp. $\bar{e}$) the edge (its complement) with one flagged vertex. Also, $P_3$
denotes the path on 3 vertices. Then

$$A^{P_3}(\bar{e}, e) = \begin{pmatrix} 0 & \frac{1}{3} \\ \frac{1}{3} & \frac{1}{3} \end{pmatrix}.$$

Proof. Let $V_1, V_2 \subset V(P_3)$ be chosen randomly with $|V_1| = |V_2| = 2$ and $|V_1 \cap V_2| = 1$. First,
$p(\bar{e}, \bar{e}; P_3) = 0$ since either $V_1$ or $V_2$ spans an edge. Also, $p(e, e; P_3) = \frac{1}{3}$ since both sets span an edge
iff their common vertex has degree 2. Finally, $p(\bar{e}, e; P_3) = \frac{1}{3}$ since the common vertex has degree 1
with probability 2/3, and conditioned on that, the first set $V_1$ spans a non-edge with probability 1/2. □
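The computation in Example 4.3 can be reproduced by brute force. The sketch below handles only $(2, 1)$-flags; the function name and the convention that index 0 stands for the non-edge flag $\bar{e}$ and index 1 for the edge flag $e$ are assumptions for illustration. It uses the fact that an ordered pair $(V_1, V_2)$ of 2-sets sharing exactly one vertex corresponds to an ordered triple $(u, v_1, v_2)$ of distinct vertices with $V_i = \{u, v_i\}$.

```python
from fractions import Fraction
from itertools import permutations


def flag_matrix_2_1(n, edges):
    """Return the 2 x 2 matrix A^G(e-bar, e) for the graph on vertices 0..n-1.

    Entry [i][j] is the probability that V_1 = {u, v1} is of type i and
    V_2 = {u, v2} is of type j, where type 0 = non-edge, type 1 = edge.
    """
    edge_set = {frozenset(e) for e in edges}
    counts = [[0, 0], [0, 0]]
    total = 0
    for u, v1, v2 in permutations(range(n), 3):
        i = int(frozenset((u, v1)) in edge_set)
        j = int(frozenset((u, v2)) in edge_set)
        counts[i][j] += 1
        total += 1
    return [[Fraction(c, total) for c in row] for row in counts]


third = Fraction(1, 3)
path_matrix = flag_matrix_2_1(3, [(0, 1), (1, 2)])      # P_3
co_path_matrix = flag_matrix_2_1(3, [(0, 1)])           # complement of P_3
empty_matrix = flag_matrix_2_1(3, [])                   # complement of K_3
```

The same routine applied to the other two triangle-free graphs on 3 vertices gives the remaining matrices needed for the triangle-free case.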

We denote by $PSD(l)$ the cone of $l \times l$ positive semi-definite matrices.

Theorem 4.4. Let $F_i$, $i = 1, \ldots, l$, be $(s, k)$-flagged graphs. For an $n$-vertex graph $G$,

$$\mathrm{dist}\left( A^G(F_1, \ldots, F_l),\ PSD(l) \right) = O\left( \frac{1}{n} \right),$$

where dist stands for distance in $\ell_2$.

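Corollary 4.5. Let $\mathcal{G}$ be a class of graphs that is closed under taking induced subgraphs. Let $\mathcal{G}_n$ be
the set of $n$-vertex members of $\mathcal{G}$. Let $\mathcal{H} = (H_1, \ldots, H_t)$ be a complete list of all the isomorphism types
of graphs in $\mathcal{G}_r$ (with $r \ge 2s - k$, so that the matrices $A^{H_\alpha}$ below are defined). Let $F_i$, $i = 1, \ldots, l$, be
$(s, k)$-flagged graphs. Then for every $(p_1, \ldots, p_t) \in A_{\mathcal{G}}(\mathcal{H})$,

$$\sum_{\alpha=1}^{t} p_\alpha \cdot A^{H_\alpha}(F_1, \ldots, F_l) \succeq 0.$$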

Let us illustrate how this corollary helps us derive an upper bound on $\lim_{n \to \infty} \max_{G \in \mathcal{G}_n} d(H; G)$
for some fixed graph $H$. (The limit exists since $\max_{G \in \mathcal{G}_n} d(H; G)$ is a non-increasing function of $n$.)
Note that

$$d(H; G) = \sum_{\alpha=1}^{t} d(H; H_\alpha)\, d(H_\alpha; G).$$

Therefore the following SDP yields an upper bound:

$$\max \sum_{\alpha=1}^{t} d(H; H_\alpha)\, p_\alpha \quad \text{s.t.}$$

$$\forall \alpha \ \ p_\alpha \ge 0 \quad \text{and} \quad \sum_{\alpha=1}^{t} p_\alpha = 1,$$

$$\sum_{\alpha=1}^{t} p_\alpha \cdot A^{H_\alpha}(F_1, \ldots, F_l) \succeq 0.$$

By SDP duality, this maximum can be also upper-bounded by

$$\min_{Q \in PSD(l)} \max_{\alpha} \left[ d(H; H_\alpha) + \mathrm{Tr}(Q \cdot A^{H_\alpha}) \right],$$

which is the more familiar form of SDP used in the literature on applications of flag algebras (see,
e.g., [22], [3]). The proofs of the above two statements are based on standard arguments.



Theorem 4.4 $\Rightarrow$ Corollary 4.5. First we prove that for every graph $G$ on at least $r$ vertices,

$$A^G(F_1, \ldots, F_l) = \sum_{\alpha=1}^{t} d(H_\alpha; G)\, A^{H_\alpha}(F_1, \ldots, F_l).$$

Namely, that for every $1 \le i, j \le l$,

$$p(F_i, F_j; G) = \sum_{\alpha=1}^{t} d(H_\alpha; G)\, p(F_i, F_j; H_\alpha).$$

This is just an application of the law of total probability. On the LHS we sample uniformly two sets
$V_1, V_2$ of size $s$ with $|V_1 \cap V_2| = k$ from $V(G)$ together with a random ordering of $V_1 \cap V_2$, and on
the RHS we first sample a random set $V$ of size $r$ from $V(G)$, and then uniformly sample $V_1, V_2$ as
above from $V$. To finish the proof, let $p = (p_1, \ldots, p_t) \in A_{\mathcal{G}}(\mathcal{H})$. By the definition of $A_{\mathcal{G}}(\mathcal{H})$ and
Theorem 4.4, for every $\varepsilon > 0$ there is a sufficiently large graph $G \in \mathcal{G}$ such that both

$$|p_\alpha - d(H_\alpha; G)| < \varepsilon \quad \forall \alpha$$

and

$$\mathrm{dist}\left( \sum_{\alpha} d(H_\alpha; G)\, A^{H_\alpha},\ PSD(l) \right) < \varepsilon.$$

Therefore, $\mathrm{dist}\left( \sum_{\alpha} p_\alpha A^{H_\alpha},\ PSD(l) \right) = 0$. □



Proof of Theorem 4.4. Let $G$ be an $n$-vertex graph. Consider the following equivalent description of
the underlying distribution in the definition of the matrix $A^G = A^G(F_1, \ldots, F_l)$. Choose uniformly at
random an ordered set $U \subset V(G)$ of size $k$ and two disjoint sets $S_1, S_2 \subset V(G) \setminus U$ of size $s - k$, and let
$V_i = S_i \cup U$, $i = 1, 2$. Thus $A^G_{i,j}$ is the probability that $F_i \cong (G|_{V_1}, U)$ and $F_j \cong (G|_{V_2}, U)$. Note that for every
fixed $U$, two sets $S_1, S_2 \subset V(G) \setminus U$ of size $s - k$ chosen uniformly and independently at random
are disjoint with probability $1 - O(1/n)$. Therefore, it suffices to prove that the matrix $B^G$, defined
exactly as $A^G$ except that $S_1, S_2$ are chosen independently, is PSD.

Consider the matrix $Q$ with $l$ rows and $\frac{n!}{(n-k)!}$ columns indexed by ordered sets $U \subset V(G)$ of size
$k$, defined as follows. Choose a random subset $S \subset V(G) \setminus U$ of size $s - k$, and let $Q_{i,U} = \Pr[F_i \cong
(G|_{S \cup U}, U)]$. Then,

$$B^G = \frac{(n-k)!}{n!}\, Q Q^T.$$

□






5 Triangle-free graphs 



In this section we prove Theorem 1.3, by showing that the set $A_{\mathcal{TF}}(\overline{K_3}, \overline{P_3}, P_3)$ is characterized by
the quadratic constraints deduced from the flag algebra theory.



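Proof of Theorem 1.3. (I) $\Rightarrow$ (II). This implication is a direct application of Corollary 4.5. Let
$\bar{e}, e$ be $(2, 1)$-flagged graphs, where $\bar{e}$ (resp. $e$) is the empty (complete) graph over 2 vertices with one flagged
vertex. By a straightforward computation (see Example 4.3),

$$A^{\overline{K_3}}(\bar{e}, e) = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}, \qquad A^{\overline{P_3}}(\bar{e}, e) = \begin{pmatrix} \frac{1}{3} & \frac{1}{3} \\ \frac{1}{3} & 0 \end{pmatrix}, \qquad A^{P_3}(\bar{e}, e) = \begin{pmatrix} 0 & \frac{1}{3} \\ \frac{1}{3} & \frac{1}{3} \end{pmatrix}.$$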



Figure 2: Triangle-free graphs and flagged graphs used in the proof



Since these are all the graphs on 3 vertices in the family $\mathcal{TF}$, we may apply Corollary 4.5 and
obtain (II).



(II) $\Rightarrow$ (III). Suppose $p_0, p_1, p_2$ satisfy the condition in (II). Since $p_0 + p_1 + p_2 = 1$, this can
be reformulated as

$$\begin{pmatrix} 3p_0 + p_1 & 1 - p_0 \\ 1 - p_0 & 1 - p_0 - p_1 \end{pmatrix} \succeq 0,$$

which implies that

$$0 \le (3p_0 + p_1)(1 - p_0 - p_1) - (1 - p_0)^2. \qquad (1)$$
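Since the matrix is $2 \times 2$ and symmetric, positive semi-definiteness amounts to non-negative diagonal entries together with a non-negative determinant, and the determinant condition is exactly inequality (1). A minimal numerical sketch follows; the function name and the tolerance are assumptions for illustration.

```python
def satisfies_sdp_constraint(p0, p1, p2, tol=1e-12):
    """Check that [[3 p0 + p1, p1 + p2], [p1 + p2, p2]] is PSD.

    With p0 + p1 + p2 = 1 the off-diagonal entry equals 1 - p0 and the
    lower-right entry equals 1 - p0 - p1. A symmetric 2 x 2 matrix is PSD
    iff both diagonal entries and the determinant are non-negative.
    """
    m11 = 3 * p0 + p1
    m12 = p1 + p2
    m22 = p2
    det = m11 * m22 - m12 * m12      # right-hand side of inequality (1)
    return m11 >= -tol and m22 >= -tol and det >= -tol


empty_graph_ok = satisfies_sdp_constraint(1.0, 0.0, 0.0)
balanced_bipartite_ok = satisfies_sdp_constraint(0.25, 0.0, 0.75)
all_paths_ok = satisfies_sdp_constraint(0.0, 0.0, 1.0)
```

The profile $(\frac{1}{4}, 0, \frac{3}{4})$ of the balanced complete bipartite graph lies exactly on the boundary (determinant zero), while the profile $(0, 0, 1)$, in which every triple would span a path, is rejected.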



Recall Definition 3.3 of $G_{x,a,b,c}$, and denote $G_{\alpha,q} := G_{\alpha,0,0,q}$, a distribution on bipartite graphs, for
$\alpha, q \in [0, 1]$. Then,

$$\mathbb{E}[p_0(G_{\alpha,q})] = 1 - 3\alpha(1 - \alpha)q(2 - q) + o(1),$$

$$\mathbb{E}[p_1(G_{\alpha,q})] = 6\alpha(1 - \alpha)q(1 - q) + o(1).$$

By Lemma 2.1, for every $\alpha, q$, $(\mathbb{E}[p_0(G_{\alpha,q})], \mathbb{E}[p_1(G_{\alpha,q})], \mathbb{E}[p_2(G_{\alpha,q})]) \in A_{\mathcal{BP}}(\overline{K_3}, \overline{P_3}, P_3)$. Thus, it
suffices, given $p_0, p_1$ that satisfy (II), to find $(\alpha, q) \in [0, 1]^2$ such that

$$p_0 = 1 - 3\alpha(1 - \alpha)q(2 - q) \quad \text{and} \quad p_1 = 6\alpha(1 - \alpha)q(1 - q).$$

This implies that

$$q = \frac{2 - 2p_0 - 2p_1}{2 - 2p_0 - p_1} \in [0, 1]$$

and

$$(1 - 2\alpha)^2 = \frac{(3p_0 + p_1)(1 - p_0 - p_1) - (1 - p_0)^2}{3(1 - p_0 - p_1)}.$$

Miraculously, $\alpha \in [0, 1]$ that satisfies this equation exists iff the quadratic constraint in (1) is satisfied
and $p_0 + p_1 < 1$. Indeed, it is easy to check that in this case the right hand side is non-negative and
is $\le 1$. On the other hand, if $p_0 + p_1 = 1$, then by (1) $p_0 = 1$ and this profile is attained for $q = 0$.
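The inversion from $(p_0, p_1)$ to the model parameters $(\alpha, q)$ can be sketched as follows. The function names are hypothetical, the $o(1)$ terms are dropped, and of the two symmetric solutions $\alpha$ and $1 - \alpha$ the code returns the one that is at most $\frac{1}{2}$.

```python
import math


def bipartite_profile(alpha, q):
    """Limiting (p_0, p_1, p_2) of the random bipartite model G_{alpha, q}."""
    w = 3 * alpha * (1 - alpha)
    return (1 - w * q * (2 - q), 2 * w * q * (1 - q), w * q * q)


def recover_parameters(p0, p1, tol=1e-12):
    """Solve for (alpha, q) given (p0, p1); raise ValueError if (1) fails."""
    p2 = 1 - p0 - p1
    if p2 <= tol:
        # Here p0 + p1 = 1, so inequality (1) forces p0 = 1, attained at q = 0.
        if abs(p0 - 1) > 1e-9:
            raise ValueError("inequality (1) is violated")
        return 0.5, 0.0
    q = 2 * p2 / (2 * p2 + p1)
    s = ((3 * p0 + p1) * p2 - (1 - p0) ** 2) / (3 * p2)   # equals (1 - 2 alpha)^2
    if s < -tol:
        raise ValueError("inequality (1) is violated")
    s = min(max(s, 0.0), 1.0)                              # absorb rounding error
    return (1 - math.sqrt(s)) / 2, q


p0_example, p1_example, _ = bipartite_profile(0.3, 0.5)
alpha_back, q_back = recover_parameters(p0_example, p1_example)
```

Starting from $\alpha = 0.3$, $q = 0.5$, the round trip returns the same parameters up to floating-point error.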





Figure 3: The region of possible $(p_0, p_1)$ of triangle-free graphs (the bounding curve is labeled "SDP constraint")

(III) $\Rightarrow$ (I). Immediate, since every bipartite graph is triangle-free.



□ 



6 Concluding remarks 

In this paper we study the set $S_3 \subset \mathbb{R}^4$ of all vectors $(p_0, \ldots, p_3)$ that are arbitrarily close to the 3-local
profiles of arbitrarily large graphs. We show that the projection of this set to the $(p_0, p_3)$ plane is
completely realizable by the graphs that are generated by a model which partitions the vertices into
two sets. We also show that the intersection of $S_3$ with the plane $p_3 = 0$, i.e. triangle-free graphs,
is completely realizable by a simple model of random bipartite graphs. We wonder how far these
observations can be extended. Razborov's work [21] shows that certain 3-profiles require the use of
$k$-partite models for arbitrarily large $k$. Also, in general it is not true that a $k$-local profile of every
$K_k$-free graph can be realized by a $(k-1)$-partite graph. Indeed, it was shown in [3] that already for
$k > 4$ the minimum density of empty sets of size $k$ in $K_k$-free graphs is strictly smaller than what
can be achieved by $(k-1)$-partite graphs.

It still remains a challenge to get a full description of the set $S_3$. The analogous questions
concerning $r$-profiles, $r > 3$, seem even more difficult. Even characterizing the profiles of ($r$-cliques,
$r$-anticliques), which is solved here for $r = 3$, is still widely open.